14 research outputs found

    Structure Learning in Audio

    Synchronization and comparison of Lifelog audio recordings

    Pitch Based Sound Classification

    Vocal Segment Classification in Popular Music

    This paper explores the vocal and non-vocal music classification problem within popular songs. A newly built labeled database covering 147 popular songs is announced. It is designed for classifying signals from 1-second time windows. Features are selected for this particular task in order to capture both the temporal correlations and the dependencies among the feature dimensions. We systematically study the performance of a set of classifiers, including linear regression, generalized linear model, Gaussian mixture model, reduced kernel orthonormalized partial least squares, and K-means, in a cross-validated training and test setup. The database is divided in two different ways, with and without artist overlap between training and test sets, so as to study the so-called ‘artist effect’. The performance and results are analyzed in depth, from error rates to sample-to-sample error correlation. A voting scheme is proposed to enhance the performance under certain conditions.
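
    The "without artist overlap" division described in the abstract above can be illustrated with a minimal sketch. It assumes each 1-second window is stored as a (window id, artist) pair; the function name `split_by_artist`, the `test_fraction` parameter, and the sample data are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict

def split_by_artist(windows, test_fraction=0.2, seed=0):
    """Split (window_id, artist) pairs so that no artist is in both sets.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    # Group window ids by artist so whole artists move together.
    by_artist = defaultdict(list)
    for window_id, artist in windows:
        by_artist[artist].append(window_id)

    # Shuffle the artist names reproducibly, then hold some out for testing.
    artists = sorted(by_artist)
    random.Random(seed).shuffle(artists)
    n_test = max(1, round(len(artists) * test_fraction))
    test_artists = set(artists[:n_test])

    train = [(w, a) for w, a in windows if a not in test_artists]
    test = [(w, a) for w, a in windows if a in test_artists]
    return train, test

# Made-up example data: two windows for each of four artists.
windows = [(f"{artist}_w{i}", artist)
           for artist in ("artist_a", "artist_b", "artist_c", "artist_d")
           for i in range(2)]
train, test = split_by_artist(windows, test_fraction=0.5)
```

    Splitting at the artist level rather than the window level is what lets a comparison between the two setups expose an artist effect: with a window-level split, the same singer can appear on both sides.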

    Pitch Based Sound Classification A master’s thesis by

    The fact that different sound environments need different sound processing is no secret, but how the different programs are selected varies widely from hearing aid to hearing aid. Fully automatic and reliable classification is desirable, because many hearing aid users are not able to select programs themselves. In this project the emphasis is on classification based on the pitch of the signal, and three classes, music, noise and speech, are used. Unfortunately, pitch is not straightforward to extract, and the first part of the project is about finding a suitable pitch detector. A new pitch detector is suggested based on two existing algorithms, pattern match with envelope detection and the harmonic product spectrum. The new algorithm is compared to a Bayesian algorithm and HMUSIC, and is found to perform better for classification purposes. Features are extracted from the signal produced by the pitch detector. Apart from the pitch itself, the error from the pitch detector is used to get a measure of how well the extracted pitch describes the signal, i.e. whether the signal is pitched or not. A total o

    Abstract

    The classic LifeLog is a data structure for accumulation of the total personal information flow. While earlier research on LifeLogs has focused on the individual level, we are interested in collective properties of multiple LifeLogs. In particular, we focus on the integration of LifeLog audio components for a small group of coworkers occupying a simple indoor environment. We consider a scenario in which each participant is wired with a single microphone. We approach the organization of the set of recordings as a cocktail party problem with variable mixing matrix. To identify short-term social interaction patterns, we estimate sparse ICA separation matrices and use the structure to make inferences: a ‘non-zero’ element indicates that a speaker is present in a given microphone.
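
    The last inference step, reading speaker presence from the non-zero pattern of a sparse matrix, can be sketched as follows. This is only an illustration under stated assumptions: the matrix values are made up, rows are assumed to index microphones and columns speakers, and the tolerance `tol` and the function name `presence_pattern` are hypothetical. Estimating the sparse ICA matrix itself is not shown.

```python
def presence_pattern(matrix, tol=1e-3):
    """Mark entries whose magnitude exceeds tol as 'non-zero'.

    Assumption for this sketch: rows are microphones, columns are speakers.
    """
    return [[abs(value) > tol for value in row] for row in matrix]

# Made-up sparse estimate for three microphones and three speakers.
W = [
    [0.9, 0.0, 0.2],
    [0.0, 1.1, 0.0],
    [0.0, 0.3, 0.8],
]

pattern = presence_pattern(W)

# For each microphone, list the speakers inferred to be present in it.
heard = {
    mic: [spk for spk, present in enumerate(row) if present]
    for mic, row in enumerate(pattern)
}
```

    In this toy example, microphone 0 picks up speakers 0 and 2, which under the abstract's reading would suggest those two participants were close enough to interact.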